43 research outputs found

    Join-Idle-Queue with Service Elasticity: Large-Scale Asymptotics of a Non-monotone System

    Get PDF
    We consider the model of a token-based joint auto-scaling and load balancing strategy, proposed in a recent paper by Mukherjee, Dhara, Borst, and van Leeuwaarden (SIGMETRICS '17, arXiv:1703.08373), which offers an efficient scalable implementation and yet achieves asymptotically optimal steady-state delay performance and energy consumption as the number of servers Nβ†’βˆžN\to\infty. In the above work, the asymptotic results are obtained under the assumption that the queues have fixed-size finite buffers, and therefore the fundamental question of stability of the proposed scheme with infinite buffers was left open. In this paper, we address this fundamental stability question. The system stability under the usual subcritical load assumption is not automatic. Moreover, the stability may not even hold for all NN. The key challenge stems from the fact that the process lacks monotonicity, which has been the powerful primary tool for establishing stability in load balancing models. We develop a novel method to prove that the subcritically loaded system is stable for large enough NN, and establish convergence of steady-state distributions to the optimal one, as Nβ†’βˆžN \to \infty. The method goes beyond the state of the art techniques -- it uses an induction-based idea and a "weak monotonicity" property of the model; this technique is of independent interest and may have broader applicability.Comment: 30 page

    Scalable Load Balancing Algorithms in Networked Systems

    Get PDF
    A fundamental challenge in large-scale networked systems viz., data centers and cloud networks is to distribute tasks to a pool of servers, using minimal instantaneous state information, while providing excellent delay performance. In this thesis we design and analyze load balancing algorithms that aim to achieve a highly efficient distribution of tasks, optimize server utilization, and minimize communication overhead.Comment: Ph.D. thesi

    Supermarket Model on Graphs

    Get PDF
    We consider a variation of the supermarket model in which the servers can communicate with their neighbors and where the neighborhood relationships are described in terms of a suitable graph. Tasks with unit-exponential service time distributions arrive at each vertex as independent Poisson processes with rate Ξ»\lambda, and each task is irrevocably assigned to the shortest queue among the one it first appears and its dβˆ’1d-1 randomly selected neighbors. This model has been extensively studied when the underlying graph is a clique in which case it reduces to the well known power-of-dd scheme. In particular, results of Mitzenmacher (1996) and Vvedenskaya et al. (1996) show that as the size of the clique gets large, the occupancy process associated with the queue-lengths at the various servers converges to a deterministic limit described by an infinite system of ordinary differential equations (ODE). In this work, we consider settings where the underlying graph need not be a clique and is allowed to be suitably sparse. We show that if the minimum degree approaches infinity (however slowly) as the number of servers NN approaches infinity, and the ratio between the maximum degree and the minimum degree in each connected component approaches 1 uniformly, the occupancy process converges to the same system of ODE as the classical supermarket model. In particular, the asymptotic behavior of the occupancy process is insensitive to the precise network topology. We also study the case where the graph sequence is random, with the NN-th graph given as an Erd\H{o}s-R\'enyi random graph on NN vertices with average degree c(N)c(N). Annealed convergence of the occupancy process to the same deterministic limit is established under the condition c(N)β†’βˆžc(N)\to\infty, and under a stronger condition c(N)/ln⁑Nβ†’βˆžc(N)/\ln N\to\infty, convergence (in probability) is shown for almost every realization of the random graph.Comment: 32 page

    Phase transitions of extremal cuts for the configuration model

    Get PDF
    The kk-section width and the Max-Cut for the configuration model are shown to exhibit phase transitions according to the values of certain parameters of the asymptotic degree distribution. These transitions mirror those observed on Erd\H{o}s-R\'enyi random graphs, established by Luczak and McDiarmid (2001), and Coppersmith et al. (2004), respectively

    Optimal Rate-Matrix Pruning For Large-Scale Heterogeneous Systems

    Full text link
    We present an analysis of large-scale load balancing systems, where the processing time distribution of tasks depends on both the task and server types. Our study focuses on the asymptotic regime, where the number of servers and task types tend to infinity in proportion. In heterogeneous environments, commonly used load balancing policies such as Join Fastest Idle Queue and Join Fastest Shortest Queue exhibit poor performance and even shrink the stability region. Interestingly, prior to this work, finding a scalable policy with a provable performance guarantee in this setup remained an open question. To address this gap, we propose and analyze two asymptotically delay-optimal dynamic load balancing policies. The first policy efficiently reserves the processing capacity of each server for ``good" tasks and routes tasks using the vanilla Join Idle Queue policy. The second policy, called the speed-priority policy, significantly increases the likelihood of assigning tasks to the respective ``good" servers capable of processing them at high speeds. By leveraging a framework inspired by the graphon literature and employing the mean-field method and stochastic coupling arguments, we demonstrate that both policies achieve asymptotic zero queuing. Specifically, as the system scales, the probability of a typical task being assigned to an idle server approaches 1

    Large deviations analysis for the M/H2/n+MM/H_2/n + M queue in the Halfin-Whitt regime

    Full text link
    We consider the FCFS M/H2/n+MM/H_2/n + M queue in the Halfin-Whitt heavy traffic regime. It is known that the normalized sequence of steady-state queue length distributions is tight and converges weakly to a limiting random variable W. However, those works only describe W implicitly as the invariant measure of a complicated diffusion. Although it was proven by Gamarnik and Stolyar that the tail of W is sub-Gaussian, the actual value of lim⁑xβ†’βˆžxβˆ’2log⁑(P(W>x))\lim_{x \rightarrow \infty}x^{-2}\log(P(W >x)) was left open. In subsequent work, Dai and He conjectured an explicit form for this exponent, which was insensitive to the higher moments of the service distribution. We explicitly compute the true large deviations exponent for W when the abandonment rate is less than the minimum service rate, the first such result for non-Markovian queues with abandonments. Interestingly, our results resolve the conjecture of Dai and He in the negative. Our main approach is to extend the stochastic comparison framework of Gamarnik and Goldberg to the setting of abandonments, requiring several novel and non-trivial contributions. Our approach sheds light on several novel ways to think about multi-server queues with abandonments in the Halfin-Whitt regime, which should hold in considerable generality and provide new tools for analyzing these systems

    Join-the-Shortest Queue Diffusion Limit in Halfin-Whitt Regime: Tail Asymptotics and Scaling of Extrema

    Get PDF
    Consider a system of NN parallel single-server queues with unit-exponential service time distribution and a single dispatcher where tasks arrive as a Poisson process of rate Ξ»(N)\lambda(N). When a task arrives, the dispatcher assigns it to one of the servers according to the Join-the-Shortest Queue (JSQ) policy. Eschenfeldt and Gamarnik (2015) established that in the Halfin-Whitt regime where (Nβˆ’Ξ»(N))/Nβ†’Ξ²>0(N-\lambda(N))/\sqrt{N}\to\beta>0 as Nβ†’βˆžN\to\infty, appropriately scaled occupancy measure of the system under the JSQ policy converges weakly on any finite time interval to a certain diffusion process as Nβ†’βˆžN\to\infty. Recently, it was further established by Braverman (2018) that the stationary occupancy measure of the system converges weakly to the steady state of the diffusion process as Nβ†’βˆžN\to\infty. In this paper we perform a detailed analysis of the steady state of the above diffusion process. Specifically, we establish precise tail-asymptotics of the stationary distribution and scaling of extrema of the process on large time-interval. Our results imply that the asymptotic steady-state scaled number of servers with queue length two or larger exhibits an Exponential tail, whereas that for the number of idle servers turns out to be Gaussian. From the methodological point of view, the diffusion process under consideration goes beyond the state-of-the-art techniques in the study of the steady-state of diffusion processes. Lack of any closed form expression for the steady state and intricate interdependency of the process dynamics on its local times make the analysis significantly challenging. We develop a technique involving the theory of regenerative processes that provides a tractable form for the stationary measure, and in conjunction with several sharp hitting time estimates, acts as a key vehicle in establishing the results.Comment: 41 pages; To appear in the Annals of Applied Probabilit

    Distributed Rate Scaling in Large-Scale Service Systems

    Full text link
    We consider a large-scale parallel-server system, where each server independently adjusts its processing speed in a decentralized manner. The objective is to minimize the overall cost, which comprises the average cost of maintaining the servers' processing speeds and a non-decreasing function of the tasks' sojourn times. The problem is compounded by the lack of knowledge of the task arrival rate and the absence of a centralized control or communication among the servers. We draw on ideas from stochastic approximation and present a novel rate scaling algorithm that ensures convergence of all server processing speeds to the globally asymptotically optimum value as the system size increases. Apart from the algorithm design, a key contribution of our approach lies in demonstrating how concepts from the stochastic approximation literature can be leveraged to effectively tackle learning problems in large-scale, distributed systems. En route, we also analyze the performance of a fully heterogeneous parallel-server system, where each server has a distinct processing speed, which might be of independent interest.Comment: 32 pages, 4 figure

    Asymptotically Optimal Load Balancing Topologies

    Full text link
    We consider a system of NN servers inter-connected by some underlying graph topology GNG_N. Tasks arrive at the various servers as independent Poisson processes of rate Ξ»\lambda. Each incoming task is irrevocably assigned to whichever server has the smallest number of tasks among the one where it appears and its neighbors in GNG_N. Tasks have unit-mean exponential service times and leave the system upon service completion. The above model has been extensively investigated in the case GNG_N is a clique. Since the servers are exchangeable in that case, the queue length process is quite tractable, and it has been proved that for any Ξ»<1\lambda < 1, the fraction of servers with two or more tasks vanishes in the limit as Nβ†’βˆžN \to \infty. For an arbitrary graph GNG_N, the lack of exchangeability severely complicates the analysis, and the queue length process tends to be worse than for a clique. Accordingly, a graph GNG_N is said to be NN-optimal or N\sqrt{N}-optimal when the occupancy process on GNG_N is equivalent to that on a clique on an NN-scale or N\sqrt{N}-scale, respectively. We prove that if GNG_N is an Erd\H{o}s-R\'enyi random graph with average degree d(N)d(N), then it is with high probability NN-optimal and N\sqrt{N}-optimal if d(N)β†’βˆžd(N) \to \infty and d(N)/(Nlog⁑(N))β†’βˆžd(N) / (\sqrt{N} \log(N)) \to \infty as Nβ†’βˆžN \to \infty, respectively. This demonstrates that optimality can be maintained at NN-scale and N\sqrt{N}-scale while reducing the number of connections by nearly a factor NN and N/log⁑(N)\sqrt{N} / \log(N) compared to a clique, provided the topology is suitably random. It is further shown that if GNG_N contains Θ(N)\Theta(N) bounded-degree nodes, then it cannot be NN-optimal. In addition, we establish that an arbitrary graph GNG_N is NN-optimal when its minimum degree is Nβˆ’o(N)N - o(N), and may not be NN-optimal even when its minimum degree is cN+o(N)c N + o(N) for any 0<c<1/20 < c < 1/2.Comment: A few relevant results from arXiv:1612.00723 are included for convenienc
    corecore